fix(parser): enhance handling of custom tags and markdown syntax#1904

Open
meet-student wants to merge 2 commits into
ant-design:mainfrom
meet-student:patch
Member

@meet-student meet-student commented May 19, 2026


🤔 This is a ...

  • 🆕 New feature
  • 🐞 Bug fix
  • 📝 Site / documentation improvement
  • 📽️ Demo improvement
  • 💄 Component style improvement
  • 🤖 TypeScript definition improvement
  • 📦 Bundle size optimization
  • ⚡️ Performance optimization
  • ⭐️ Feature enhancement
  • 🌐 Internationalization
  • 🛠 Refactoring
  • 🎨 Code style optimization
  • ✅ Test Case
  • 🔀 Branch merge
  • ⏩ Workflow
  • ⌨️ Accessibility improvement
  • ❓ Other (about what?)

🔗 Related Issues

  • Describe the source of related requirements, such as links to relevant issue discussions.
  • For example: close #xxxx, fix #xxxx

💡 Background and Solution

  • The specific problem to be addressed.
  • List the final API implementation and usage if needed.
  • If there are UI/interaction changes, consider providing screenshots or GIFs.

📝 Change Log

| Language | Changelog |
| --- | --- |
| 🇺🇸 English | enhance handling of custom tags and markdown syntax |
| 🇨🇳 Chinese | - |

Summary by CodeRabbit

  • New Features

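    • Added the rawCustomComponents option, which allows the inner content of HTML tags registered as components to be passed to the component as raw plain text (disabled by default).
  • Bug Fixes

    • Improved the parser's handling of content inside custom components so that Markdown-like text is no longer misparsed or turned into links; content in unclosed-tag scenarios also stays as text.
  • Tests

    • Expanded test coverage with behavior assertions for several custom-component and streaming-parse scenarios.
  • Documentation

    • Updated the component docs and examples to explain the purpose and default value of rawCustomComponents.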


@dosubot dosubot Bot added the bug Something isn't working label May 19, 2026
Contributor

coderabbitai Bot commented May 19, 2026

📝 Walkthrough

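Adds a rawCustomComponents option to the Parser and refactors the custom-tag protection logic: it introduces native HTML tag recognition, wraps complete tags and their content in placeholders, and always protects first, then calls marked to parse, then restores the placeholders. Unit tests, integration tests, types, and documentation are updated to match.

Overview

This PR refactors the protection mechanism for custom tags in Markdown, introduces native HTML tag recognition, and improves the placeholder strategy so that custom component content is always passed through as plain text without being processed by the Markdown parser.

Changes

Custom tag protection mechanism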

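| Layer | File(s) | Summary |
| --- | --- | --- |
| Native tag recognition and placeholder infrastructure | packages/x-markdown/src/XMarkdown/core/Parser.ts | Adds ParserOptions.rawCustomComponents, the NATIVE_HTML_TAGS constant set (lines 38-152), and the CustomTagPlaceholder type (lines 173-176), and distinguishes native tags from custom components when collecting the tags to protect. |
| Core rewrite of protectCustomTags | packages/x-markdown/src/XMarkdown/core/Parser.ts | Adds createPlaceholder (lines 348-353) and rewrites protectCustomTags (lines 377-426) to wrap the complete opening tag, inner content, and closing tag in a placeholder; unclosed tags are also returned as placeholders; when rawCustomComponents=true, the inner content is escaped. |
| Placeholder restoration and parse flow update | packages/x-markdown/src/XMarkdown/core/Parser.ts | restorePlaceholders now replaces placeholders one by one with split/join, and the parse flow becomes: run protectCustomTags unconditionally → parse with marked → restorePlaceholders (lines 487-489). |
| Parser unit tests | packages/x-markdown/src/XMarkdown/__tests__/Parser.test.ts | Adds/updates the protectCustomTagNewlines test group (lines 84-149, 161-169), covering preservation of text inside custom components, unclosed tag handling, and behavior assertions for native HTML tags under different configurations. |
| Integration tests and streaming scenarios | packages/x-markdown/src/XMarkdown/__tests__/index.test.tsx | Adds integration and streaming tests (lines 179-246) asserting that the props.children a component receives is raw plain text and that no link or formatting elements are produced, verifying what is received when a tag is unclosed, and testing the effect of rawCustomComponents on native tag mappings. |
| Component wiring and types | packages/x-markdown/src/XMarkdown/index.tsx, packages/x-markdown/src/XMarkdown/interface.ts | XMarkdown destructures rawCustomComponents from props and passes it to the Parser (lines 24, 62-76); adds rawCustomComponents?: boolean to XMarkdownProps (lines 182-187) and adjusts the order of exported types (lines 203-209). |
| Docs and examples | packages/x/docs/x-markdown/* | Adds usage notes and API table entries for rawCustomComponents to the Chinese and English component docs and examples (1-2 lines in each of several files). |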
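
The protection step summarized in the walkthrough can be pictured with a small sketch. This is not the PR's code: `protectTag`, `escapeHtml`, and `ProtectResult` are hypothetical names, the placeholder shape is borrowed from the regex quoted in a review comment on this PR, and the exact escaping the PR applies in raw mode is an assumption. It only illustrates wrapping a whole tag (including an unclosed one, as seen mid-stream) in a placeholder.

```typescript
// Escape the characters that would otherwise be read as HTML.
// Which characters the PR actually escapes is an assumption here.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

interface ProtectResult {
  protectedText: string;
  placeholders: Map<string, string>;
}

// tagName is assumed to be a plain identifier such as "Demo"
// (no regex metacharacters).
function protectTag(src: string, tagName: string, raw: boolean): ProtectResult {
  const placeholders = new Map<string, string>();
  // Opening tag, lazy inner content, then either the closing tag or the end
  // of input, so an unclosed tag is still captured as one unit.
  const pattern = new RegExp(`<${tagName}(\\s[^>]*)?>([\\s\\S]*?)(</${tagName}>|$)`, "g");
  const protectedText = src.replace(
    pattern,
    (_match: string, attrs: string | undefined, inner: string, close: string) => {
      const key = `\u0000XMDPLACEHOLDER${placeholders.size}\u0000`;
      const body = raw ? escapeHtml(inner) : inner;
      placeholders.set(key, `<${tagName}${attrs ?? ""}>${body}${close}`);
      return key;
    },
  );
  return { protectedText, placeholders };
}

// An unclosed tag, as it might arrive part-way through a stream.
const streamed = protectTag("intro <Demo>**x**", "Demo", false);
// A closed tag with raw mode on: the inner markup is escaped.
const closed = protectTag('<Demo a="1"><b>hi</b></Demo> tail', "Demo", true);
```

The Markdown renderer would then only ever see the placeholder keys, and the stored values are swapped back in after rendering.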

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • ant-design/x#1688: Related to this PR's refactoring of the protectCustomTags/restorePlaceholders area; it proposed earlier adjustments to the custom tag protection implementation.

Suggested labels

lgtm, BranchAutoMerge

Suggested reviewers

  • Div627

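Rabbit's verse

🐰 Placeholders find a hiding place, native and custom told apart,
Protect first, then parse, steady as a mountain,
Streaming and unclosed, still no panic,
Docs and tests are all updated,
The little rabbit hops to celebrate the fix!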

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and accurately summarizes the main change: enhanced handling of custom tags and Markdown syntax. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request modifies the markdown parser to protect custom component tags from being processed as markdown, ensuring their content is treated as plain text. Key changes include the introduction of a native HTML tag exclusion list, support for protecting unclosed tags during streaming, and a transition to a more robust placeholder system. Feedback highlights a performance concern in the restorePlaceholders method, where a regex-based approach is recommended over the current iterative split().join() implementation. Additionally, it was noted that the protectCustomTagNewlines option is now bypassed, potentially causing unintended breaking changes for users who rely on markdown parsing within custom components.

Comment on lines +412 to +416
    let restored = content;
    placeholders.forEach((value, placeholder) => {
      restored = restored.split(placeholder).join(value);
    });
    return restored;
medium

The current implementation of restorePlaceholders is inefficient for large documents or many placeholders because it performs a full string traversal (split().join()) for every single placeholder. Since placeholders are unique and non-overlapping, using a single regex replacement is much more performant.

    return content.replace(/\u0000XMDPLACEHOLDER\d+\u0000/g, (match) => {
      return placeholders.get(match) ?? match;
    });
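
To make the suggestion concrete, here is a minimal self-contained sketch of the single-pass restore. It assumes the placeholder shape implied by the regex above (a NUL character, `XMDPLACEHOLDER`, digits, a NUL character); `makePlaceholder` is a hypothetical helper used only to build sample data, and the standalone `restorePlaceholders` function is an illustration rather than the PR's method.

```typescript
// Placeholder shape assumed from the regex in the suggestion above.
const PLACEHOLDER_PATTERN = /\u0000XMDPLACEHOLDER\d+\u0000/g;

// Hypothetical helper that mints a placeholder in that shape.
function makePlaceholder(index: number): string {
  return `\u0000XMDPLACEHOLDER${index}\u0000`;
}

// One pass over the content: each match is looked up in the map, and a
// match with no entry is left untouched.
function restorePlaceholders(content: string, placeholders: Map<string, string>): string {
  return content.replace(PLACEHOLDER_PATTERN, (match) => placeholders.get(match) ?? match);
}

const placeholders = new Map<string, string>([
  [makePlaceholder(0), "<Demo>**x**</Demo>"],
  [makePlaceholder(1), "<Note>[a](b)</Note>"],
]);
const parsed = `<p>${makePlaceholder(0)} and ${makePlaceholder(1)}</p>`;
const restored = restorePlaceholders(parsed, placeholders);
// restored === "<p><Demo>**x**</Demo> and <Note>[a](b)</Note></p>"
```

One behavioral difference is worth checking before adopting this: the split/join loop can expand a placeholder that appears inside an already restored value on a later iteration, while a single pass does not revisit restored text. If placeholder values never contain other placeholders, the two approaches produce the same result.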

Comment on lines +477 to +479
    const { protected: protectedContent, placeholders } = this.protectCustomTags(content);
    const parsed = this.markdownInstance.parse(protectedContent) as string;
    return this.restorePlaceholders(parsed, placeholders);
medium

The protectCustomTagNewlines option is now effectively ignored as protectCustomTags is called unconditionally in the parse method. This represents a significant change in behavior: markdown syntax will no longer be parsed inside any custom components provided in the components prop.

If this is the intended new default behavior, the protectCustomTagNewlines prop should be deprecated or its documentation updated to reflect that it no longer controls this protection. Otherwise, the logic should respect the flag to allow users to opt-out of this behavior if they want markdown parsing within their custom components.
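
The second alternative (respecting the flag) could look roughly like the sketch below. Everything here is a simplified stand-in: `renderMarkdown` replaces marked with a one-rule renderer so the example runs on its own, and `protectCustomTags` only knows a single hypothetical `Demo` tag. Only the gating in `parse` is the point.

```typescript
// Stand-in for the Markdown renderer (the PR uses marked). It only turns
// **text** into <strong>text</strong> so the sketch stays self-contained.
function renderMarkdown(src: string): string {
  return src.replace(/\*\*(.+?)\*\*/g, "<strong>$1</strong>");
}

// Simplified stand-in for protectCustomTags: hides every <Demo>...</Demo> span.
function protectCustomTags(src: string): { protected: string; placeholders: Map<string, string> } {
  const placeholders = new Map<string, string>();
  const protectedText = src.replace(/<Demo>[\s\S]*?<\/Demo>/g, (match) => {
    const key = `\u0000XMDPLACEHOLDER${placeholders.size}\u0000`;
    placeholders.set(key, match);
    return key;
  });
  return { protected: protectedText, placeholders };
}

function restorePlaceholders(src: string, placeholders: Map<string, string>): string {
  return src.replace(/\u0000XMDPLACEHOLDER\d+\u0000/g, (m) => placeholders.get(m) ?? m);
}

// The protect/restore flow runs only when the caller opts in.
function parse(content: string, options: { protectCustomTagNewlines?: boolean } = {}): string {
  if (!options.protectCustomTagNewlines) {
    return renderMarkdown(content);
  }
  const { protected: protectedContent, placeholders } = protectCustomTags(content);
  return restorePlaceholders(renderMarkdown(protectedContent), placeholders);
}

const input = "<Demo>**x**</Demo> **y**";
const defaultOutput = parse(input);
const protectedOutput = parse(input, { protectCustomTagNewlines: true });
```

With the flag off, Markdown inside `<Demo>` is still rendered; with it on, the tag content passes through untouched while text outside the tag is rendered as usual.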


codecov Bot commented May 19, 2026

Bundle Report

Changes will increase total bundle size by 185.07kB (5.63%) ⬆️⚠️, exceeding the configured threshold of 5%.

| Bundle name | Size | Change |
| --- | --- | --- |
| x-markdown-array-push | 1.37MB | 2.17kB (0.16%) ⬆️ |
| antdx-array-push | 2.11MB | 182.9kB (9.5%) ⬆️⚠️ |

Affected Assets, Files, and Routes:

Changes for bundle: x-markdown-array-push

Assets Changed:

| Asset Name | Size Change | Total Size | Change (%) |
| --- | --- | --- | --- |
| latex.min.js | 2.17kB | 264.89kB | 0.83% |
| latex.min.css | -6 bytes | 24.39kB | -0.02% |

Changes for bundle: antdx-array-push

Assets Changed:

| Asset Name | Size Change | Total Size | Change (%) |
| --- | --- | --- | --- |
| antdx.min.js | 182.9kB | 2.11MB | 9.5% ⚠️ |

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/x-markdown/src/XMarkdown/core/Parser.ts (1)

290-405: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

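The placeholder scan here will wrongly hit code snippets, and the opening-tag matching is not robust enough either.

protectCustomTags() now runs a global regex scan directly on the raw Markdown without skipping fenced code or inline code. An example such as `<Demo>**x**</Demo>` is first replaced with a placeholder; after marked generates <code>...</code>, restorePlaceholders() restores it to a real tag, so the code example ends up no longer escaped. In addition, an opening tag with > inside an attribute value, such as <Demo title="a > b">, is cut off early. It would be better to switch to token-based protection here, or at least exclude code spans / code blocks before matching tags.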

Also applies to: 408-416
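
A rough sketch of the "exclude code first, then match tags" option from the comment above follows. The names `CODE_PATTERN`, `OPEN_TAG`, and `findCustomOpenTags` are hypothetical, the code detection is deliberately simplified (real CommonMark fences and code spans have more cases, such as tildes and longer backtick runs), and `Demo` is just the example tag from the comment.

```typescript
// Fenced blocks (three backticks ... three backticks) and single-backtick
// inline code. Simplified on purpose; not a full CommonMark implementation.
const CODE_PATTERN = /`{3}[\s\S]*?`{3}|`[^`\n]+`/g;

// Opening tag whose attribute values may contain ">" inside quotes.
const OPEN_TAG = /<Demo(?:\s+(?:"[^"]*"|'[^']*'|[^'">])*)?>/;

function findCustomOpenTags(markdown: string): string[] {
  // Blank out code regions with same-length filler so match offsets still
  // line up with the original string.
  const masked = markdown.replace(CODE_PATTERN, (code) => " ".repeat(code.length));
  const tags: string[] = [];
  const scanner = new RegExp(OPEN_TAG.source, "g");
  let match: RegExpExecArray | null;
  while ((match = scanner.exec(masked)) !== null) {
    tags.push(markdown.slice(match.index, match.index + match[0].length));
  }
  return tags;
}

const sample = "Use `<Demo>**x**</Demo>` in docs.\n\n<Demo title=\"a > b\">text</Demo>";
const found = findCustomOpenTags(sample);
// found contains only the real opening tag, with its full attribute list.
```

The tag inside the inline code span is skipped, and the quoted `>` no longer ends the opening tag early. A token-based approach would be more robust than this masking trick, at the cost of a larger change.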

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/x-markdown/src/XMarkdown/core/Parser.ts` around lines 290 - 405,
protectCustomTags currently scans raw markdown and mis-identifies custom tags
inside code spans/blocks and breaks open-tag matching when attribute values
contain >; update protectCustomTags to first detect and replace fenced code
blocks and inline code spans with temporary placeholders (reused by
restorePlaceholders) before running the custom-tag scan, or switch to a
token-based approach if a Markdown tokenizer is available; also strengthen the
openTagRegex used in protectCustomTags to allow attributes with quoted " or '
characters (e.g., match attributes with a pattern that accepts quoted strings)
so opening tags like <Demo title="a > b"> are not truncated, and continue to use
createPlaceholder / placeholders map for protected content so
restorePlaceholders can reinstate originals.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/x-markdown/src/XMarkdown/__tests__/Parser.test.ts`:
- Around line 123-132: The protectCustomTagNewlines option is unused: either
remove it from ParserOptions and from all constructor invocations (including
tests like the cases in Parser.test.ts that pass protectCustomTagNewlines) and
delete the two redundant tests, or explicitly mark it deprecated by adding a
JSDoc `@deprecated` to the protectCustomTagNewlines field in the
ParserOptions/type and update the tests to assert deprecation (or keep a comment
noting it's noop); ensure you also remove any references in the Parser
constructor signature and any default options handling (symbols to check:
Parser, ParserOptions, protectCustomTagNewlines, protectCustomTags()) so the
code and tests stay consistent.

In `@packages/x-markdown/src/XMarkdown/core/Parser.ts`:
- Around line 477-479: The parser currently always calls
protectCustomTags/restorePlaceholders inside parse(), ignoring the
ParserOptions.protectCustomTagNewlines flag and causing a silent breaking
change; update parse() to read the ParserOptions.protectCustomTagNewlines (or
this.options.protectCustomTagNewlines) and only run the protect/restore flow
(calls to protectCustomTags and restorePlaceholders) when that flag is true,
otherwise skip those calls and pass content straight to markdownInstance.parse;
keep references to the existing methods protectCustomTags, restorePlaceholders,
and the ParserOptions property protectCustomTagNewlines so callers' behavior
remains configurable.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b6859f1f-9fb7-453f-8f0b-ea6c6e4159fa

📥 Commits

Reviewing files that changed from the base of the PR and between 557c127 and f1cd335.

📒 Files selected for processing (3)
  • packages/x-markdown/src/XMarkdown/__tests__/Parser.test.ts
  • packages/x-markdown/src/XMarkdown/__tests__/index.test.tsx
  • packages/x-markdown/src/XMarkdown/core/Parser.ts

Contributor

Div627 commented May 19, 2026

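@meet-student Hi, x-markdown currently follows the CommonMark specification, so the behavior here is expected and not a bug. Per the spec, content inside non-block-level HTML tags continues to be parsed as Markdown.

If what you need is for the content inside HTML tags to be treated as plain text (for example, no further Markdown parsing), consider adding an optional setting: when enabled, it filters against a whitelist built from the passed-in components and handles matching tags with custom rules. That keeps spec compatibility while also covering this kind of special case.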

@meet-student
Member Author

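> Hi, x-markdown currently follows the CommonMark specification, so the behavior here is expected and not a bug. Per the spec, content inside non-block-level HTML tags continues to be parsed as Markdown.
>
> If what you need is for the content inside HTML tags to be treated as plain text (for example, no further Markdown parsing), consider adding an optional setting: when enabled, it filters against a whitelist built from the passed-in components and handles matching tags with custom rules. That keeps spec compatibility while also covering this kind of special case.

So, should we define an API for it?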

@meet-student
Member Author

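> @meet-student Hi, x-markdown currently follows the CommonMark specification, so the behavior here is expected and not a bug. Per the spec, content inside non-block-level HTML tags continues to be parsed as Markdown.
>
> If what you need is for the content inside HTML tags to be treated as plain text (for example, no further Markdown parsing), consider adding an optional setting: when enabled, it filters against a whitelist built from the passed-in components and handles matching tags with custom rules. That keeps spec compatibility while also covering this kind of special case.

Since the HTML tag has already been defined as a custom component, shouldn't the content under that custom component be a plain string, left for the user to handle?

And HTML tags in the content that have no custom component would be filtered?

Content outside custom components follows the CommonMark specification, and the behavior there is as expected.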

Contributor

Div627 commented May 19, 2026

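> > @meet-student Hi, x-markdown currently follows the CommonMark specification, so the behavior here is expected and not a bug. Per the spec, content inside non-block-level HTML tags continues to be parsed as Markdown.
> > If what you need is for the content inside HTML tags to be treated as plain text (for example, no further Markdown parsing), consider adding an optional setting: when enabled, it filters against a whitelist built from the passed-in components and handles matching tags with custom rules. That keeps spec compatibility while also covering this kind of special case.
>
> Since the HTML tag has already been defined as a custom component, shouldn't the content under that custom component be a plain string, left for the user to handle?
>
> And HTML tags in the content that have no custom component would be filtered?
>
> Content outside custom components follows the CommonMark specification, and the behavior there is as expected.

@meet-student Understood. I suggest enabling this explicitly through an option (such as rawCustomComponents) rather than making it the default behavior:

  1. Avoid a breaking change: enabling it by default would break the existing parsing logic for nested custom components;
  2. Stay spec-compatible: ordinary HTML still follows CommonMark, and only tags that match the components whitelist are treated as plain text.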
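
The opt-in rule proposed here can be reduced to a small decision function. This is only an illustration of the two points above: `RawOptions`, `InnerHandling`, and `innerHandlingFor` are hypothetical names, and how the PR treats native HTML tags that are also mapped in components is not covered.

```typescript
interface RawOptions {
  // Tag names registered as components; this acts as the whitelist.
  components: Record<string, unknown>;
  // Opt-in flag, off by default as suggested in the comment above.
  rawCustomComponents?: boolean;
}

type InnerHandling = "plain-text" | "markdown";

// A tag's inner content is treated as plain text only when the option is
// enabled AND the tag is on the components whitelist; otherwise it keeps
// the normal CommonMark handling.
function innerHandlingFor(tagName: string, options: RawOptions): InnerHandling {
  const enabled = options.rawCustomComponents ?? false;
  const registered = Object.prototype.hasOwnProperty.call(options.components, tagName);
  return enabled && registered ? "plain-text" : "markdown";
}

const components = { Demo: () => null };
const offByDefault = innerHandlingFor("Demo", { components });
const optedIn = innerHandlingFor("Demo", { components, rawCustomComponents: true });
const notRegistered = innerHandlingFor("span", { components, rawCustomComponents: true });
```

Keeping the default at "markdown" is what avoids the breaking change for existing nested custom components.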

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
packages/x-markdown/src/XMarkdown/interface.ts (1)

205-209: 💤 Low value

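The change to the export type order appears unnecessary.

The order of types in the export list has changed (the positions of StreamStatus, Token, Tokens, and XMarkdownProps were adjusted), but this has no functional effect and looks like it may be accidental formatting or editing. Please confirm whether the change is intentional; if it is not needed, restore the original order to reduce the diff.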

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/x-markdown/src/XMarkdown/interface.ts` around lines 205 - 209, The
exported type order was unintentionally changed; revert the export list in this
module so the types appear in the original sequence (put StreamStatus, Token,
Tokens, XMarkdownProps back in their previous order and keep TailConfig where it
originally was) by editing the export statement that references StreamStatus,
TailConfig, Token, Tokens, and XMarkdownProps to match the original ordering to
minimize diff noise.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cbf27ee8-77b4-4538-b8e1-828b72cd5460

📥 Commits

Reviewing files that changed from the base of the PR and between f1cd335 and 047ddc0.

📒 Files selected for processing (9)
  • packages/x-markdown/src/XMarkdown/__tests__/Parser.test.ts
  • packages/x-markdown/src/XMarkdown/__tests__/index.test.tsx
  • packages/x-markdown/src/XMarkdown/core/Parser.ts
  • packages/x-markdown/src/XMarkdown/index.tsx
  • packages/x-markdown/src/XMarkdown/interface.ts
  • packages/x/docs/x-markdown/components.en-US.md
  • packages/x/docs/x-markdown/components.zh-CN.md
  • packages/x/docs/x-markdown/examples.en-US.md
  • packages/x/docs/x-markdown/examples.zh-CN.md
✅ Files skipped from review due to trivial changes (3)
  • packages/x/docs/x-markdown/components.en-US.md
  • packages/x/docs/x-markdown/examples.en-US.md
  • packages/x/docs/x-markdown/examples.zh-CN.md

@meet-student
Member Author

done

The lint issues will be fixed in PR #1905.


Labels

bug Something isn't working


2 participants